In this capstone project I will try to find a good location for an Indian restaurant in Manhattan. Specifically, this report is targeted at stakeholders interested in opening an Indian restaurant in Manhattan, NY.
Since there are 2874 restaurants in Manhattan I will try to find locations
Using data science, I will try to find and present to the stakeholders the most promising neighborhoods of Manhattan in which to open an Indian restaurant.
Based on the definition of the business problem, the decision will be influenced by the following factors:
To find the most promising neighborhoods in which to open an Indian restaurant in Manhattan, I will use the following data sources:
#!conda install -c conda-forge folium --yes
#!conda install -c conda-forge geopy --yes
import numpy as np
import pandas as pd
import json
from geopy.geocoders import Nominatim
import requests
from pandas import json_normalize  # top-level import since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium
import pyproj
import math
print('All necessary libraries imported!')
Load the New York dataset of neighborhoods
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)
neighborhoods_data = newyork_data['features']
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']
# collect one row per neighborhood, then build the dataframe in one step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]
    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})
neighborhoods = pd.DataFrame(rows, columns=column_names)
Filter the dataframe for neighborhoods of Manhattan
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data
Get the latitude and longitude of Manhattan with geopy
address = 'Manhattan, NY'
geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))
Create a folium map of New York and mark all Manhattan neighborhoods and the center of Manhattan on it.
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
map_manhattan
Define dataframe with all neighborhoods, latitude, longitude, distance to center of Manhattan, x, y
# Manhattan (about 74 degrees W) lies in UTM zone 18N, i.e. EPSG:32618.
# The transformers are created once and reused, because they are called many times.
_to_xy = pyproj.Transformer.from_crs('EPSG:4326', 'EPSG:32618', always_xy=True)
_to_lonlat = pyproj.Transformer.from_crs('EPSG:32618', 'EPSG:4326', always_xy=True)
def lonlat_to_xy(lon, lat):
    # project WGS84 longitude/latitude to UTM x/y in meters
    x, y = _to_xy.transform(lon, lat)
    return x, y
def xy_to_lonlat(x, y):
    # project UTM x/y in meters back to WGS84 longitude/latitude
    lon, lat = _to_lonlat.transform(x, y)
    return lon, lat
def calc_xy_distance(x1, y1, x2, y2):
    # Euclidean distance in meters between two projected points
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)
# calculate distances from center
distance_from_center = []
X = []
Y = []
manhattan_longitude = longitude
manhattan_latitude = latitude
manhattan_x, manhattan_y = lonlat_to_xy(manhattan_longitude, manhattan_latitude)
for i in range(len(manhattan_data)):
    neighborhood_x, neighborhood_y = lonlat_to_xy(manhattan_data['Longitude'][i], manhattan_data['Latitude'][i])
    distance_from_center.append(calc_xy_distance(manhattan_x, manhattan_y, neighborhood_x, neighborhood_y))
    X.append(neighborhood_x)
    Y.append(neighborhood_y)
# the positional axis argument of drop() was removed in pandas 2.0
manhattan_data = manhattan_data.drop(columns='Borough')
manhattan_data['X'] = X
manhattan_data['Y'] = Y
manhattan_data['Distance from Center'] = distance_from_center
manhattan_data
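The planar distances above depend on the UTM zone passed to the projection, and that zone should match the longitude of the data. The following is a minimal sketch, not part of the original analysis, of two projection-free checks that use only the standard library: a helper that derives the UTM zone number from a longitude, and a haversine great-circle distance that can be compared with `calc_xy_distance`. The names `utm_zone` and `haversine_m` and the spherical Earth radius of 6,371 km are assumptions made for illustration.

```python
import math

def utm_zone(lon):
    """Return the UTM zone number (1-60) containing a longitude in degrees."""
    return int((lon + 180) // 6) % 60 + 1

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters on a sphere (an approximation)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# Illustrative inputs: the two focus points used later in this notebook.
zone = utm_zone(-73.980685)
d = haversine_m(40.762849, -73.980685, 40.802038, -73.948810)
print('UTM zone:', zone)
print('Distance between the two points: {:.0f} m'.format(d))
```

If the projected distance between the same two points differs from the haversine value by more than a fraction of a percent, the projection setup is worth a second look.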
Insert your Foursquare credentials
#hidden cell
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (do not publish real credentials)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
Query all restaurants and all Indian restaurants for each neighborhood from the Foursquare API
def getNearbyVenues(names, latitudes, longitudes, category, radius=500, LIMIT=200):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            category,
            radius,
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']
    return nearby_venues
manhattan_restaurants = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                        latitudes=manhattan_data['Latitude'],
                                        longitudes=manhattan_data['Longitude'],
                                        category='4d4b7105d754a06374d81259', radius=500, LIMIT=200)
manhattan_indian_restaurants = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                               latitudes=manhattan_data['Latitude'],
                                               longitudes=manhattan_data['Longitude'],
                                               category='4bf58dd8d48988d10f941735', radius=500, LIMIT=200)
print(manhattan_restaurants.shape)
manhattan_restaurants.head(20)
print(manhattan_indian_restaurants.shape)
manhattan_indian_restaurants.head(20)
print('Total number of restaurants in Manhattan:', len(manhattan_restaurants))
print('Total number of Indian restaurants in Manhattan:', len(manhattan_indian_restaurants))
print('Percentage of Indian restaurants in Manhattan: {:.2f}%'.format(len(manhattan_indian_restaurants) / len(manhattan_restaurants) * 100))
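One caveat about these totals: each query covers a 500 m radius around a neighborhood center, so where two centers lie close together the same venue may be returned for both neighborhoods, and `len()` then counts rows rather than distinct restaurants. The sketch below shows one possible way to check for that with `drop_duplicates`. The toy dataframe `venues` is hypothetical and only mirrors the column names used above; whether duplicates actually occur in the real result has to be checked on the real data.

```python
import pandas as pd

# Hypothetical toy data: 'Cafe X' is returned for two neighboring centers.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'B'],
    'Venue': ['Cafe X', 'Cafe X', 'Diner Y'],
    'Venue Latitude': [40.7600, 40.7600, 40.7700],
    'Venue Longitude': [-73.9800, -73.9800, -73.9700],
})

# Treat rows with the same name and coordinates as the same venue.
unique_venues = venues.drop_duplicates(subset=['Venue', 'Venue Latitude', 'Venue Longitude'])

n_rows = len(venues)
n_unique = len(unique_venues)
print('Rows returned:', n_rows)
print('Distinct venues:', n_unique)
```

The per-neighborhood counts used later can keep the duplicated rows, since a venue near a border is relevant to both neighborhoods; only the borough-wide totals are affected.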
Create a folium map that displays all restaurants in Manhattan in different colors: Indian restaurants in green, other restaurants in red, and the center of Manhattan in orange.
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
for lat, lng in zip(manhattan_restaurants['Venue Latitude'], manhattan_restaurants['Venue Longitude']):
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
for lat, lng in zip(manhattan_indian_restaurants['Venue Latitude'], manhattan_indian_restaurants['Venue Longitude']):
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
map_manhattan
We now have a feel for the data.
We have gathered all the information we need for the further analysis.
This concludes the data preparation phase; we can now continue with the analysis of the data to find the most promising neighborhoods.
The goal of this project is to detect the most promising areas of Manhattan in which to open an Indian restaurant.
In the first step I want to see whether I can identify areas in Manhattan with a low density of restaurants and of Indian restaurants that are as close as possible to the center of Manhattan.
Therefore I calculate additional figures for each neighborhood to get a better understanding of the data:
Then I will use heatmaps to visualize:
and choropleth maps to visualize:
In the second step I will use the identified areas and generate a grid of cells for those areas.
For every grid cell I will calculate some figures that describe how good the location is, so that the cells can be filtered into a map of all areas that look promising for opening an Indian restaurant.
For each grid cell the following figures will be calculated:
Then the generated dataframe of all grid cells will be filtered for grid cells where:
In the final step I will generate a heatmap to visualize the filtered list of grid cells, which represents a map of all the promising locations for opening an Indian restaurant in Manhattan.
Let's start the analysis by identifying areas in Manhattan with a low density of restaurants and of Indian restaurants that are as close as possible to the center of Manhattan. To do that, let's derive some additional data from our prepared dataset.
First we need the number of restaurants and the number of Indian restaurants in every neighborhood.
#get the total number of restaurants in each neighborhood
restaurants_count=manhattan_restaurants['Neighborhood'].value_counts()
restaurants_count = pd.DataFrame([restaurants_count])
restaurants_count=restaurants_count.transpose().reset_index()
restaurants_count.columns =['Neighborhood','Count']
#get the total number of Indian restaurants in each neighborhood
indian_restaurants_count=manhattan_indian_restaurants['Neighborhood'].value_counts()
indian_restaurants_count = pd.DataFrame([indian_restaurants_count])
indian_restaurants_count=indian_restaurants_count.transpose().reset_index()
indian_restaurants_count.columns =['Neighborhood','Count']
restaurants_count.head()
indian_restaurants_count.head()
manhattan_data.head()
# work on a copy so that manhattan_data itself stays unchanged
manhattan_data_v2 = manhattan_data.copy()
manhattan_data_v2['Number of Restaurants'] = manhattan_data_v2.Neighborhood.map(restaurants_count.set_index('Neighborhood')['Count'].to_dict())
manhattan_data_v2['Number of Indian Restaurants'] = manhattan_data_v2.Neighborhood.map(indian_restaurants_count.set_index('Neighborhood')['Count'].to_dict())
# neighborhoods without any Indian restaurant get a count of 0
manhattan_data_v2['Number of Indian Restaurants'] = manhattan_data_v2['Number of Indian Restaurants'].fillna(0)
manhattan_data_v2.head()
Next we calculate the percentage of Indian restaurants in each neighborhood.
Percentage=[]
# one entry per row, so the list always matches the length of the dataframe
for i in range(len(manhattan_data_v2)):
    Percentage.append(round(manhattan_data_v2['Number of Indian Restaurants'][i]/manhattan_data_v2['Number of Restaurants'][i],2))
manhattan_data_v2['Percentage of Indian Restaurants']=Percentage
manhattan_data_v2.head()
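Note that the column stores the share as a fraction (for example 0.10), not as a percentage. The same calculation can also be written without a loop. The sketch below is one possible vectorized form on a hypothetical toy dataframe `df` that reuses the column names from above; it additionally leaves the share undefined (NaN) for a neighborhood in which no restaurant was found, which is an assumption about the desired behavior rather than something the original analysis specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with the same column names as manhattan_data_v2.
df = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Number of Restaurants': [50.0, 20.0, np.nan],
    'Number of Indian Restaurants': [5.0, 0.0, 0.0],
})

# Missing totals count as 0; zero totals become NaN to avoid dividing by zero.
total = df['Number of Restaurants'].fillna(0)
share = (df['Number of Indian Restaurants'] / total.where(total > 0)).round(2)
df['Percentage of Indian Restaurants'] = share
print(df)
```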
Now we calculate the distance from the center of each neighborhood to the nearest Indian restaurant.
Distances=[]
for i in range(len(manhattan_data_v2)):
    shortest_distance=None
    latitude_neighborhood=manhattan_data_v2['Latitude'][i]
    longitude_neighborhood=manhattan_data_v2['Longitude'][i]
    #calculate x, y of neighborhood
    x_neigh, y_neigh=lonlat_to_xy(longitude_neighborhood,latitude_neighborhood)
    for s in range(manhattan_indian_restaurants.shape[0]):
        latitude_restaurant=manhattan_indian_restaurants['Venue Latitude'][s]
        longitude_restaurant=manhattan_indian_restaurants['Venue Longitude'][s]
        #calculate x, y of Indian restaurant
        x_rest, y_rest=lonlat_to_xy(longitude_restaurant,latitude_restaurant)
        #calculate distance and keep the smallest one seen so far
        dist = calc_xy_distance(x_neigh, y_neigh, x_rest, y_rest)
        if shortest_distance is None or dist<shortest_distance:
            shortest_distance=dist
    Distances.append(round(shortest_distance,2))
manhattan_data_v2['Distance to Indian Restaurants from Center']=Distances
manhattan_data_v2
print('On average the distance from the center of a neighborhood to the closest Indian restaurant is: ', manhattan_data_v2['Distance to Indian Restaurants from Center'].mean())
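The nearest-restaurant search above is a double loop over neighborhoods and restaurants. Once all points are available as projected x/y coordinates in meters, the same result can be obtained with NumPy broadcasting. The following is a minimal sketch under that assumption; the function name `nearest_distance` and the toy coordinates are hypothetical.

```python
import numpy as np

def nearest_distance(points_xy, targets_xy):
    """For each point, return the distance to its closest target (same units as input)."""
    points = np.asarray(points_xy, dtype=float)
    targets = np.asarray(targets_xy, dtype=float)
    # pairwise differences with shape (n_points, n_targets, 2)
    diff = points[:, None, :] - targets[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.min(axis=1)

# Hypothetical projected coordinates in meters.
centers = [(0.0, 0.0), (1000.0, 0.0)]
restaurants = [(300.0, 400.0), (1000.0, 250.0)]
nearest = nearest_distance(centers, restaurants)
print(nearest)
```

For a few dozen neighborhoods the loop is fast enough; the vectorized form mainly pays off for the much larger grid used later.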
Let's visualize the density of restaurants in Manhattan with a heatmap.
Red means higher density.
The blue dots represent the center of each neighborhood.
from folium import plugins
from folium.plugins import HeatMap
manhattan_neighborhoods_url = 'https://raw.githubusercontent.com/ibuilder/NYCPolyline/master/manhattan.geojson'
manhattan_neighborhoods = requests.get(manhattan_neighborhoods_url).json()
def boroughs_style(feature):
    return { 'color': 'blue', 'fill': False }
restaurant_latlons=manhattan_restaurants['Venue Latitude'].to_frame()
restaurant_latlons['Venue Longitude']=manhattan_restaurants['Venue Longitude']
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
HeatMap(restaurant_latlons).add_to(map_manhattan)
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
Now let's visualize the density of Indian restaurants in Manhattan with a heatmap.
Red means higher density.
The blue dots represent the center of each neighborhood.
indian_restaurant_latlons=manhattan_indian_restaurants['Venue Latitude'].to_frame()
indian_restaurant_latlons['Venue Longitude']=manhattan_indian_restaurants['Venue Longitude']
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
HeatMap(indian_restaurant_latlons).add_to(map_manhattan)
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
Now let's visualize both heatmaps together to see whether we can spot areas near the center of Manhattan with a low density of restaurants and a low density of Indian restaurants.
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
HeatMap(restaurant_latlons).add_to(map_manhattan)
HeatMap(indian_restaurant_latlons).add_to(map_manhattan)
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
Looking at the density of restaurants and Indian restaurants in Manhattan, we can see a few spaces with low density close to the center of Manhattan.
In the close area around the center of Manhattan:
A bit further away:
As we can see, in the closer area around the center of Manhattan the areas with an overall low density of restaurants match the areas with a low density of Indian restaurants quite well.
We can also see that the heatmap of Indian restaurants is not that hot in general. With an overall share of roughly 10%, Indian restaurants do not make up a large part of the restaurants in Manhattan.
Unfortunately, the neighborhoods in the New York dataset and in the GeoJSON file do not match perfectly.
Some neighborhoods are named differently or are merged in the GeoJSON file.
Now let's visualize the share of Indian restaurants in Manhattan with a choropleth map.
The share is color-coded, starting with a low share in yellow and increasing to a higher share in red.
The blue dots represent the center of each neighborhood.
newyork_geo = r'https://raw.githubusercontent.com/ibuilder/NYCPolyline/master/manhattan.geojson'
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
# folium.Choropleth replaces the older Map.choropleth method, which newer folium releases no longer provide
folium.Choropleth(
    geo_data=newyork_geo,
    data=manhattan_data_v2,
    columns=['Neighborhood', 'Percentage of Indian Restaurants'],
    key_on='feature.properties.neighborhood',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Percentage of Indian Restaurants'
).add_to(map_manhattan)
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
map_manhattan
Unfortunately, the GeoJSON file and the New York dataset do not match perfectly.
The naming of the neighborhoods is sometimes slightly different, and the neighborhood centers sometimes do not match the GeoJSON file.
For example:
Hamilton Heights, Manhattanville, Central Harlem = Harlem in the GeoJSON file
Central Park does not exist in the New York dataset
Hudson Yards, Clinton = Hells Kitchen, Theater District in the GeoJSON file ...
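One way to deal with such naming differences would be an explicit lookup table that maps each dataset neighborhood to its GeoJSON name and aggregates the counts before plotting. The sketch below illustrates the idea for the Harlem example only. The dictionary `name_map`, the toy counts and the exact spelling of the GeoJSON names are assumptions; a real mapping would have to be built by inspecting both files.

```python
import pandas as pd

# Hypothetical mapping from dataset names to GeoJSON names (Harlem example from the text).
name_map = {
    'Hamilton Heights': 'Harlem',
    'Manhattanville': 'Harlem',
    'Central Harlem': 'Harlem',
}

# Hypothetical toy counts, not taken from the real query results.
counts = pd.DataFrame({
    'Neighborhood': ['Hamilton Heights', 'Manhattanville', 'Central Harlem', 'Chelsea'],
    'Number of Restaurants': [40, 30, 50, 100],
    'Number of Indian Restaurants': [2, 1, 3, 10],
})

# Names without an entry in the mapping are kept unchanged.
counts['GeoJSON Name'] = counts['Neighborhood'].map(name_map).fillna(counts['Neighborhood'])
value_columns = ['Number of Restaurants', 'Number of Indian Restaurants']
merged = counts.groupby('GeoJSON Name', as_index=False)[value_columns].sum()
merged['Share'] = (merged['Number of Indian Restaurants'] / merged['Number of Restaurants']).round(2)
print(merged)
```

The aggregated frame could then be passed to the choropleth with `'GeoJSON Name'` as the key column.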
Now let's visualize the distance from the center of each neighborhood to the nearest Indian restaurant in Manhattan with a choropleth map.
The distance is color-coded, starting with a short distance in yellow and increasing to a longer distance in red.
The blue dots represent the center of each neighborhood.
newyork_geo = r'https://raw.githubusercontent.com/ibuilder/NYCPolyline/master/manhattan.geojson'
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.CircleMarker([latitude, longitude], radius=7, color='orange', fill=True, fill_color='orange', fill_opacity=1).add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=1000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=3000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=5000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=7000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=9000, fill=False, color='white').add_to(map_manhattan)
folium.Circle(location=[latitude, longitude], radius=11000, fill=False, color='white').add_to(map_manhattan)
# folium.Choropleth replaces the older Map.choropleth method, which newer folium releases no longer provide
folium.Choropleth(
    geo_data=newyork_geo,
    data=manhattan_data_v2,
    columns=['Neighborhood', 'Distance to Indian Restaurants from Center'],
    key_on='feature.properties.neighborhood',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Distance to nearest Indian restaurant (m)'
).add_to(map_manhattan)
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
map_manhattan
Unfortunately, we cannot take much information out of the choropleth maps, because the areas we identified in the heatmaps are exactly the ones that are named differently in the GeoJSON file. So for those areas in particular, the choropleth maps show no information.
Let's focus on the following two areas and generate a grid of cells to evaluate each location in more detail.
center_manhattan=[latitude, longitude]
focus_area1=[40.762849, -73.980685]
focus_area2=[40.802038, -73.948810]
map_manhattan = folium.Map(location=center_manhattan, zoom_start=13)
HeatMap(restaurant_latlons).add_to(map_manhattan)
HeatMap(indian_restaurant_latlons).add_to(map_manhattan)
#folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
folium.Marker(center_manhattan).add_to(map_manhattan)
folium.Circle(focus_area1, radius=1500, color='white', fill=True, fill_opacity=0.4).add_to(map_manhattan)
folium.Circle(focus_area2, radius=1100, color='white', fill=True, fill_opacity=0.4).add_to(map_manhattan)
map_manhattan
Let's proceed with the second step of our analysis.
Let's define a grid of cells that covers the areas we identified before.
# define focus areas
focus_area1=[40.762849, -73.980685]
#focus_area2=[40.80451, -73.946072]
#focus_area2=[40.803429, -73.947583]
focus_area2=[40.802038, -73.948810]
# define area 1
lat1_min=focus_area1[0]-0.008
lon1_min=focus_area1[1]-0.014
lat1_max=focus_area1[0]+0.008
lon1_max=focus_area1[1]+0.014
# define area 2
lat2_min=focus_area2[0]-0.008*1300/1500
lon2_min=focus_area2[1]-0.014*1300/1500
lat2_max=focus_area2[0]+0.008*1300/1500
lon2_max=focus_area2[1]+0.014*1300/1500
#corner points of area 1
point1=[lat1_min,lon1_min]
point2=[lat1_max,lon1_min]
point3=[lat1_max,lon1_max]
point4=[lat1_min,lon1_max]
#corner points of area 2
point5=[lat2_min,lon2_min]
point6=[lat2_max,lon2_min]
point7=[lat2_max,lon2_max]
point8=[lat2_min,lon2_max]
#define lists for latitudes and longitudes
focus_area_latitudes=[]
focus_area_longitudes=[]
#define a grid of points in area1
stepwith=0.0012
steps1_lat=int(round((lat1_max-lat1_min)/stepwith,0))
steps1_lon=int(round((lon1_max-lon1_min)/stepwith,0))
long=lon1_min
for i in range(steps1_lon):
    long=long+stepwith
    lati=lat1_min
    for s in range(steps1_lat):
        lati=lati+stepwith
        focus_area_latitudes.append(lati)
        focus_area_longitudes.append(long)
#define a grid of points in area2
steps2_lat=int(round((lat2_max-lat2_min)/stepwith,0))
steps2_lon=int(round((lon2_max-lon2_min)/stepwith,0))
long=lon2_min
for i in range(steps2_lon):
    long=long+stepwith
    lati=lat2_min
    for s in range(steps2_lat):
        lati=lati+stepwith
        focus_area_latitudes.append(lati)
        focus_area_longitudes.append(long)
print(str(len(focus_area_latitudes))+" grid points generated!")
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.Marker(center_manhattan).add_to(map_manhattan)
for lat, lng in zip(focus_area_latitudes, focus_area_longitudes):
    folium.CircleMarker(
        [lat, lng],
        radius=1,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
HeatMap(restaurant_latlons).add_to(map_manhattan)
HeatMap(indian_restaurant_latlons).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
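The two nested loops that build the grid can also be expressed with NumPy. The sketch below is one possible equivalent, assuming the same convention as the loops above (the first point sits one step inside the minimum corner, and points are ordered by longitude first, then latitude). The function name `make_grid` is hypothetical. Note that a step of 0.0012 degrees is not square on the ground: at Manhattan's latitude it corresponds to roughly 130 m north-south and roughly 100 m east-west.

```python
import numpy as np

def make_grid(center, half_lat, half_lon, step):
    """Return flat latitude and longitude arrays for a regular grid around center."""
    lat0, lon0 = center
    n_lat = int(round(2 * half_lat / step))
    n_lon = int(round(2 * half_lon / step))
    # the first point is one step above the minimum, as in the loops above
    lats = lat0 - half_lat + step * np.arange(1, n_lat + 1)
    lons = lon0 - half_lon + step * np.arange(1, n_lon + 1)
    # 'ij' indexing with longitude first keeps longitude as the outer loop
    lon_grid, lat_grid = np.meshgrid(lons, lats, indexing='ij')
    return lat_grid.ravel(), lon_grid.ravel()

# Focus areas and extents as defined in the notebook.
lats1, lons1 = make_grid((40.762849, -73.980685), 0.008, 0.014, 0.0012)
lats2, lons2 = make_grid((40.802038, -73.948810), 0.008 * 1300 / 1500, 0.014 * 1300 / 1500, 0.0012)
grid_lats = np.concatenate([lats1, lats2])
grid_lons = np.concatenate([lons1, lons2])
print(len(grid_lats), 'grid points generated!')
```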
Looks great. The grids cover most of the free space near the center of Manhattan where the density of both restaurants and Indian restaurants is low.
Now let's build a dataframe of all those points and calculate the important figures for them:
restaurants_nearby=[]
distance_next_indian=[]
distance_center_manhattan=[]
for i in range(len(focus_area_latitudes)): # 539 grid points
    #calculate x,y of grid point
    x_grid, y_grid = lonlat_to_xy(focus_area_longitudes[i], focus_area_latitudes[i])
    count=0
    shortest_distance=None
    #calculate number of restaurants in area of 250m around
    for s in range(len(restaurant_latlons)):
        x_restaurant, y_restaurant = lonlat_to_xy(restaurant_latlons['Venue Longitude'][s], restaurant_latlons['Venue Latitude'][s])
        distance=calc_xy_distance(x_grid, y_grid, x_restaurant, y_restaurant)
        if distance<250:
            count=count+1
    restaurants_nearby.append(count)
    #calculate distance to next Indian restaurant
    for k in range(len(indian_restaurant_latlons)):
        x_restaurant, y_restaurant = lonlat_to_xy(indian_restaurant_latlons['Venue Longitude'][k], indian_restaurant_latlons['Venue Latitude'][k])
        dist=calc_xy_distance(x_grid, y_grid, x_restaurant, y_restaurant)
        if shortest_distance is None or dist<shortest_distance:
            shortest_distance=dist
    distance_next_indian.append(round(shortest_distance,0))
    #calculate distance to center of manhattan
    x_center_manhattan, y_center_manhattan = lonlat_to_xy(center_manhattan[1], center_manhattan[0])
    dist=calc_xy_distance(x_grid, y_grid, x_center_manhattan, y_center_manhattan)
    distance_center_manhattan.append(round(dist,0))
grid_df=pd.DataFrame({'Latitude':focus_area_latitudes,
                      'Longitude':focus_area_longitudes,
                      'Restaurants nearby':restaurants_nearby,
                      'Distance next Indian Restaurant':distance_next_indian,
                      'Distance to Center':distance_center_manhattan})
grid_df.head()
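The loop above reprojects every restaurant for every grid point, which gets slow as the grid grows. If the restaurants are projected to x/y once up front, the count of restaurants within a radius can be computed for all grid points at the same time. The following is a minimal sketch of that idea with NumPy; the function name `count_within` and the toy coordinates are hypothetical, and the inputs are assumed to be projected coordinates in meters.

```python
import numpy as np

def count_within(points_xy, venues_xy, radius_m):
    """For each point, count the venues closer than radius_m."""
    points = np.asarray(points_xy, dtype=float)
    venues = np.asarray(venues_xy, dtype=float)
    # pairwise differences with shape (n_points, n_venues, 2)
    diff = points[:, None, :] - venues[None, :, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    return (dist < radius_m).sum(axis=1)

# Hypothetical projected coordinates in meters.
grid = [(0.0, 0.0), (1000.0, 1000.0)]
venues = [(100.0, 0.0), (0.0, 200.0), (0.0, 300.0), (1000.0, 1100.0)]
counts = count_within(grid, venues, 250)
print(counts)
```

The pairwise distance matrix needs memory proportional to the number of grid points times the number of venues, which is unproblematic at the sizes used here.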
Let's filter the grid points. We are interested in locations with no restaurant within a radius of 250 m and no Indian restaurant within a radius of 500 m.
good_res_count = np.array((grid_df['Restaurants nearby']<=0))
print('Locations with no restaurants within 250m:', good_res_count.sum())
good_ind_distance = np.array(grid_df['Distance next Indian Restaurant']>=500)
print('Locations with no Indian restaurants within 500m:', good_ind_distance.sum())
good_locations = np.logical_and(good_res_count, good_ind_distance)
print('Locations with both conditions met:', good_locations.sum())
df_good_locations = grid_df[good_locations]
df_good_locations.head()
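The two conditions can also be combined directly as pandas boolean masks, which keeps the filter criteria next to the column names they apply to. The sketch below shows this on a hypothetical toy frame `toy_grid` with the same column names as `grid_df`; the thresholds are the ones stated above.

```python
import pandas as pd

# Hypothetical toy grid with the same column names as grid_df.
toy_grid = pd.DataFrame({
    'Restaurants nearby': [0, 0, 3, 1],
    'Distance next Indian Restaurant': [620.0, 310.0, 800.0, 550.0],
})

no_restaurants = toy_grid['Restaurants nearby'] == 0                # none within 250 m
no_indian = toy_grid['Distance next Indian Restaurant'] >= 500      # none within 500 m
good = toy_grid[no_restaurants & no_indian]
print(good)
```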
Let's visualize the grid points with no restaurant within 250 m and no Indian restaurant within 500 m.
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.Marker(center_manhattan).add_to(map_manhattan)
for lat, lng in zip(df_good_locations['Latitude'], df_good_locations['Longitude']):
    folium.CircleMarker(
        [lat, lng],
        radius=1,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)
HeatMap(restaurant_latlons).add_to(map_manhattan)
HeatMap(indian_restaurant_latlons).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
Looks good. The remaining grid cells fit neatly into the gaps between the hot spots of the restaurant and Indian restaurant heatmaps.
Let's visualize a heatmap of the good locations that match the criteria of no restaurant within 250 m and no Indian restaurant within a radius of 500 m.
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=13)
folium.Marker(center_manhattan).add_to(map_manhattan)
HeatMap(pd.DataFrame({'Latitude':df_good_locations['Latitude'],
'Longitude':df_good_locations['Longitude']})).add_to(map_manhattan)
folium.GeoJson(manhattan_neighborhoods, style_function=boroughs_style, name='geojson').add_to(map_manhattan)
map_manhattan
The map represents the final result. It visualizes all the promising areas close to the center of Manhattan for opening an Indian restaurant.
Be aware that the part of the heatmap overlapping Central Park needs to be ignored, because it is obviously not possible to open a restaurant there.
The analysis shows some areas close to the center of Manhattan where the density of restaurants and Indian restaurants is low, even though there are nearly 3000 restaurants in Manhattan.
The analysis presents two areas where there is no Indian restaurant within a radius of 500 m and no restaurant of any kind within a radius of 250 m.
From the perspective of competition, the analysis presents two quite large areas where it might be interesting to open an Indian restaurant. It does not take into account whether the rent is affordable, whether spaces are available for a restaurant, or whether the neighborhood is attractive.
The purpose of this analysis was to present the stakeholders with attractive locations for opening an Indian restaurant in Manhattan.
To that end, the analysis used data science to calculate the density of restaurants and Indian restaurants. By visualizing those densities we were able to identify two areas quite close to the center of Manhattan where the density of restaurants and Indian restaurants is very low.
This analysis lays the foundation for the stakeholders' decision on where to open an Indian restaurant. For that decision, additional factors need to be taken into account, for example the rent, the availability of suitable premises, the population density and the overall attractiveness of the neighborhood.